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METHOD AND APPARATUS FOR COPYING 
DATA THAT RESIDES IN A DATABASE 

FIELD OF THE INVENTION 

The present invention relates to the copying of data and 5 
more specifically to producing a copy of data that resides in 
a database. 

BACKGROUND OF THE INVENTION 

Planning software is used by manufacturers to aid in the 10 
manufacturing process. Based upon the desired product 
output and the components needed for each product, the 
planning software generates a schedule of what components 
need to be manufactured and by when, and what materials 
need to be procured and by when. This schedule is generated 15 
based upon data stored in a database. The planning process 
is often complicated and, for complex products, can take 
many hours to complete. 

A typical requirement imposed by the planning software 
is that it needs to do its processing based on a single 
consistent version of the database. If one process of the 
planning software is reading one version of the database 
while another process is reading an updated version of the 
database, serious errors, such as double counting, can occur. 
As a result, planning software requires that data be provided 
from a single frozen version of the database in order to 
operate properly. 

As is well known, a database or a selected subset thereof 
can be frozen by obtaining exclusive locks on all of the 3Q 
tables in the database or the selective subset. Once locked, 
the tables can be processed by the planning software to carry 
out the planning process. However, as noted above, the 
planning process can take many hours to complete. Many 
companies, especially those having offices around the world, 35 
cannot afford to lock their tables for extended periods of 
time. Hence, locking tables in this manner is often not a 
viable solution. 

Another possible solution is to simply make a copy of the 
database prior to running the planning software. The prob- 4Q 
lem with this solution is that for large databases, the copying 
process itself can take several hours. During this time, the 
tables need to be locked to ensure a frozen state. As long as 
the tables are locked, no updates can be made. Hence, this 
solution suffers from the same shortcomings, albeit to a 45 
lesser degree, as the locking solution. 

Based on the foregoing, it is clearly desirable to provide 
a mechanism for obtaining a single frozen version of the 
database, or a subset thereof, without locking tables in the 
database for an extended period of time. 50 

SUMMARY OF THE INVENTION 

According to one aspect of the invention, a method and 
apparatus for supplying a consistent set of data to a software 
application is provided. 55 

According to the method, a software application is 
launched that requires a particular set of data contained in a 
first database. Once the particular set of data is identified, a 
first process is requested to obtain a snapshot time from a 
database server associated with the first database. The snap- 60 
shot time causes all subsequent reads of the first database by 
the first process to return data that reflects a database state 
associated with the snapshot time. After the first process 
obtains the snapshot time, the first process extracts the 
particular set of data from the first database. The software 65 
application is then supplied with the particular set of data 
that was extracted from the first database. 



In one embodiment, a second process is used to store the 
particular set of data that was extracted into a second 
database. 

According to another aspect of the invention, a method 
and apparatus for producing a copy of data from a first 
database is provided. 

According to the method, a first set of data in the first 
database is locked. After locking the first set of data, a 
plurality of processes are requested to obtain snapshot times 
from a database server associated with said first database. 
The snapshot times cause all subsequent reads of the first 
database by the plurality of processes to return data from the 
first database as of the snapshot times. After waiting a 
particular period of time for the plurality of processes to be 
assigned snapshot times, the locks on the first set of data in 
the first database are released. 

The plurality of processes that were successful in obtain- 
ing a snapshot time within the particular period of time are 
used to extract a copy of the first set of data from the first 
database. The copy of the first set of data is then separately 
stored from the first of data. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention is illustrated by way of example, 
and not by way of limitation, in the figures of the accom- 
panying drawings and in which like reference numerals refer 
to similar elements and in which: 

FIG. 1 is a block diagram of a system that may be 
programmed to implement the present invention; 

FIG. 2 is a block diagram of a system that is used for 
producing a copy of a database in accordance with an 
embodiment of the present invention; 

FIG. 2A illustrates the use of a snapshot time in accor- 
dance with an embodiment of the invention; 

FIG. 2B illustrates the locking a database to obtain 
snapshot times that correspond to a single database state; 

FIG. 3A is a portion of a flow diagram illustrating a 
method for producing a copy of a database without retaining 
a lock on the database tables according to an embodiment of 
the invention; 

FIG. 3B is another portion of a flow diagram illustrating 
a method for producing a copy of a database without 
retaining a lock on the database tables according to an 
embodiment of the invention; 

FIG 4 is a block diagram of a system that is used for 
producing a copy of a database in accordance with an 
embodiment of the present invention; 

FIG. 5A is a portion of a flow diagram illustrating a 
method for producing a copy of a database without locking 
the database tables; 

FIG SB is another portion of a flow diagram illustrating 
a method for producing a copy of a database without locking 
the database tables; and 

FIG 6 is a block diagram of alternate system for produc- 
ing a copy of a database in accordance with an embodiment 
of the present invention. 

DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENT 

A method and apparatus for producing a consistent copy 
of a database, or portion thereof, is described. In the fol- 
lowing description, for the purposes of explanation, numer- 
ous specific details are set forth in order to provide a 
thorough understanding of the present invention. It will be 
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apparent, however, to one skilled in the art that the present 
invention may be practiced without these specific details. In 
other instances, well-known structures and devices are 
shown in block diagram form in order to avoid unnecessarily 
obscuring the present invention. 5 

Hardware Overview 

FIG. 1 is a block diagram that illustrates a computer 
system 100 upon which an embodiment of the invention 
may be implemented. Computer system 100 includes a bus 
102 or other communication mechanism for communicating 
information, and a processor 104 coupled with bus 102 for 
processing information. Computer system 100 also includes 
a main memory 106, such as a random access memory 
(RAM) or other dynamic storage device, coupled to bus 102 
for storing information and instructions to be executed by 
processor 104. Main memory 106 also may be used for 
storing temporary variables or other intermediate informa- 
tion during execution of instructions to be executed by 
processor 104. Computer system 100 further includes a read 
only memory (ROM) 108 or other static storage device 
coupled to bus 102 for storing static information and instruc- 
tions for processor 104. A storage device 110, such as a 
magnetic disk or optical disk, is provided and coupled to bus 
102 for storing information and instructions. 

Computer system 100 may be coupled via bus 102 to a 
display 112, such as a cathode ray tube (CRT), for displaying 
information to a computer user. An input device 114, includ- 
ing alphanumeric and other keys, is coupled to bus 102 for 30 
communicating information and command selections to 
processor 104. Another type of user input device is cursor 
control 116, such as a mouse, a trackball, or cursor direction 
keys for communicating direction information and com- 
mand selections to processor 104 and for controlling cursor 35 
movement on display 112. This input device typically has 
two degrees of freedom in two axes, a first axis (e.g., x) and 
a second axis (e.g., y), that allows the device to specify 
positions in a plane. 

The invention is related to the use of computer system 100 40 
for producing a copy of a database. According to one 
embodiment of the invention, a copy of a database is 
produced by computer system 100 in response to processor 
104 executing one or more sequences of one or more 
instructions contained in main memory 106. Such instruc- 45 
tions may be read into main memory 106 from another 
computer-readable medium, such as storage device 110. 
Execution of the sequences of instructions contained in main 
memory 106 causes processor 104 to perform the process 
steps described herein. In alternative embodiments, hard- 50 
wired circuitry may be used in place of or in combination 
with software instructions to implement the invention. Thus, 
embodiments of the invention are not limited to any specific 
combination of hardware circuitry and software. 

The term "computer-readable medium" as used herein 55 
refers to any medium that participates in providing instruc- 
tions to processor 104 for execution. Such a medium may 
take many forms, including but not limited to, non-volatile 
media, volatile media, and transmission media. Non-volatile 
media includes, for example, optical or magnetic disks, such 60 
as storage device 110. Volatile media includes dynamic 
memory, such as main memory 106. Transmission media 
includes coaxial cables, copper wire and fiber optics, includ- 
ing the wires that comprise bus 102. Transmission media can 
also take the form of acoustic or light waves, such as those 65 
generated during radio-wave and infra-red data communi- 
cations. 
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Common forms of computer-readable media include, for 
example, a floppy disk, a flexible disk, hard disk, magnetic 
tape, or any other magnetic medium, a CDROM, any other 
optical medium, punchcards, papertape, any other physical 
medium with patterns of holes, a RAM, a PROM, and 
EPROM, a FLASH-EPROM, any other memory chip or 
cartridge, a carrier wave as described hereinafter, or any 
other medium from which a computer can read. 

Various forms of computer readable media may be 
involved in carrying one or more sequences of one or more 
instructions to processor 104 for execution. For example, the 
instructions may initially be carried on a magnetic disk of a 
remote computer. The remote computer can load the instruc- 
tions into its dynamic memory and send the instructions over 
a telephone line using a modem. A modem local to computer 
system 100 can receive the data on the telephone line and 
use an infra-red transmitter to convert the data to an infra-red 
signal. An infra-red detector can receive the data carried in 
the infra-red signal and appropriate circuitry can place the 
data on bus 102. Bus 102 carries the data to main memory 
106, from which processor 104 retrieves and executes the 
instructions. The instructions received by main memory 106 
may optionally be stored on storage device 110 either before 
or after execution by processor 104. 

Computer system 100 also includes a communication 
interface 118 coupled to bus 102. Communication interface 
118 provides a two-way data communication coupling to a 
network link 120 that is connected to a local network 122. 
For example, communication interface 118 may be an inte- 
grated services digital network (ISDN) card or a modem to 
provide a data communication connection to a correspond- 
ing type of telephone line. As another example, communi- 
cation interface 118 may be a local area network (LAN) card 
to provide a data communication connection to a compatible 
LAN. Wireless links may also be implemented. In any such 
implementation, communication interface 118 sends and 
receives electrical, electromagnetic or optical signals that 
carry digital data streams representing various types of 
information. 

Network link 120 typically provides data communication 
through one or more networks to other data devices. For 
example, network link 120 may provide a connection 
through local network 122 to a host computer 124 or to data 
equipment operated by an Internet Service Provider (ISP) 
126. ISP 126 in turn provides data communication services 
through the world wide packet data communication network 
now commonly referred to as the "Internet" 128. Local 
network 122 and Internet 128 both use electrical, electro- 
magnetic or optical signals that carry digital data streams. 
The signals through the various networks and the signals on 
network link 120 and through communication interface 118, 
which carry the digital data to and from computer system 
100, are exemplary forms of carrier waves transporting the 
information. 

Computer system 100 can send messages and receive 
data, including program code, through the network(s), net- 
work link 120 and communication interface 118. In the 
Internet example, a server 130 might transmit a requested 
code for an application program through Internet 128, ISP 
126, local network 122 and communication interface 118. In 
accordance with the invention, one such downloaded appli- 
cation provides for producing a copy of a database as 
described herein. 

The received code may be executed by processor 104 as 
it is received, and/or stored in storage device 110, or other 
non-volatile storage for later execution. In this manner, 
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computer system 100 may obtain application code in the 215) to obtain a snapshot time from the database manage- 

form of a carrier wave. ment system (DBMS). Snapshot times are assigned by the 

database management system, and are used by the database 

Functional Overview management system to determine what version of data to 

The present invention provides a mechanism for produc- 5 supply to processes. For example, the version of data items 

ing a consistent copy of a database, or portion thereof, tnat supplied to a process assigned a snapshot time of Ti 

without locking the database or desired portion for an ^ reflect the state of those data items as of time Tl , even 

extended period of time. For the purpose of explanation, if tDe data i^ms have been subsequently modified, 

embodiments of the invention shall be described with ref- Snapshot times are used to guarantee that a particular 

erence to a relational database that stores data in tables. The 10 version of data is given to a process (i.e. snapshot worker), 

database portion of which a consistent copy is required is Thus, if a process requests data that has been modified since 

therefore referred to herein as the "desired tables". However, obtaining its snapshot time, the DBMS reconstructs the state 

the techniques described herein are not limited to tables or of data as of the snapshot time and supplies the reconstructed 

relational databases. data to the process. 

In one embodiment, the following steps are performed to 15 For example, FIG. 2A illustrates the use of a snapshot 

obtain a snapshot (consistent copy) of a database: time in accordance with an embodiment of the invention. In 

(1) the desired tables (e.g. the tables that contain infer- FIG. 2A, DB_A, DB_B, DB_C, DB_D, DB_E and 
mation required by a memory-based planner) are identified; DB_F respectively represent different states of database 204 

(2) the desired tables are locked to prevent them from 20 at time T0 » n - ^ n » T4 and T5. If a snapshot time is 
being updated* issued at time TO and a read is subsequently performed at 

(3) a coordinator requests a plurality of snapshot worker * me ^ P****? T P ° Dd '°r ^ '™ F 'AoT? 
processes to obtain snapshot times, where the snapshot times \ u ■ * ^ P^ 
allow the snapshot workers to obtain data from the database ™ e P?. my ° f mechanisms, 
as of a particular state of the database; 25 According to one embodiment, the DBMS uses log entries 

... A . ,. . . ...... to reconstruct the state of the requested data items to reflect 

(4) the coordinator then waits a certain period of time for ^ ^ at ^ m fa eff ^ hot ^ enables a 

the requested plurahty of snapshot workers to obtain snap- ^ , 0 xe , ^ q{ a f a icular 

shot times; ^ 

(5) the coordinator then releases the locks on the desired ^ Qne mechanism by which , pBMS can reconstruct a 
es * *" particular state of the database is described in detail in U.S 

(6) using the snapshot workers that obtained a snapshot p atent Applications "Method and Apparatus for Providing 
time before the locks were released (the "successful snap- isolation Levels in a Database", having Ser. No. 08/613,026, 
shot workers"), a copy of the desired tables is produced f^ cd Mar n> 1996 - "Techniques for Providing the Number 
which represents the state of the database at the time desired of Snapshots of a Database", having Ser. No. 08/838,967, 
tables were locked. filed Apr> 2 3, 1997, and "Dynamic Snapshot Set 

System Overview Adjustment", having Ser. No. 08/841,541, filed Apr. 23, 

1997, which are hereby incorporated by reference. 

FIG. 2 depicts a block diagram of a system 200 that is ^ e coordinator 216 requests the plurality of snap- 
used for producing a copy of desired tables in accordance ^ shot wor kers (206, 208, 210, 212, 214 and 215) to obtain 
with an embodiment of the present invention. snapshot times, it then waits for a certain period of time 

As depicted in FIG. 2, a memory based planner 202 has (T_WAIT) for the snapshot times to be assigned to the 

been launched by a user to generate a planning schedule that snapshot workers by the DBMS. The snapshot times that the 

is based upon a snapshot of database 204. Typically, gen- DBMS assigns to the snapshot workers approximately 

eration of a planning schedule will only require data from a 45 reflect the times at which the snapshot time assignments are 

subset of the tables contained in database 204. Therefore, a made. Since the snapshot times are being assigned after the 

copy table list 218 is maintained which lists the database desired tables are locked, and before the locks on the desired 

tables (L.N) in database 204 that are required in the gen- tables are released, the snapshot times assigned to the 

eration of the planning schedule (the "desired tables"). snapshot workers will correspond to a time at which the 

A coordinator 216 interfaces with individual worker 50 desired tables were locked, 

processes, snapshot workers (206-214), loader workers In certain embodiments, the snapshot workers obtain 

(230-238) and delete workers(242-248), and is responsible snapshot times by executing a "set transaction read only" 

for the overall coordination of producing a copy of database command. Once a process issues a "set transaction read 

204 for use in generation of the planning schedule. only" command to the DBMS, that process is ensured by the 

. 55 DBMS that all subsequent reads by that process for that 

Operation or System 200 transaction will return data that was in the database as of the 

To produce a snapshot of database 204, data copy table time the "set transaction read only" command was issued. 

218 is generated which identifies the desired set of data In certain database systems, using the "set transaction 

tables (1..N) in database 204. The desired set of tables listed read only" command has certain limitations. Specifically, the 

in data copy table 218 are the tables that contain information 60 process that issued the command cannot perform any data- 

that is needed by the memory based planner 202 in order to base modifications (defined as any inserts, updates, or 

generate the planning schedule. Using the information con- deletes). If it does perform database modifications, it loses 

tained in the data copy table 218, the coordinator 216 locks the benefits of the command (i.e. loses its snapshot time), 

each desired table (L.N). The locks thus obtained prevent Because of this, it is not possible to use the "set transaction 

other processes from modifying the desired tables (L.N). $5 read only" command with a transaction that makes copies 

Once the locks are obtained, the coordinator 216 requests within a database of the relevant tables (since copying would 

a plurality of snapshot workers (206, 208, 210, 212, 214 and involve writing into a database). 
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To circumvent this problem, the successful snapshot 
workers read the data from the database, and then write the 
data to one or more flat files outside of the database. Because 
writing to a flat file is an operating system command, and not 
a DBMS command, it does not constitute a "write" operation 5 
as far as the DBMS is concerned. Hence, the snapshot times 
obtained through the "set transaction read only" command 
remains in effect for each snapshot worker. 

When the (T_WAJT) time period expires, the coordinator 
216 releases its locks on the tables. Ai this time, some of the 1 
snapshot workers that were requested to obtain snapshot 
times may not yet have been assigned snapshot times. To 
ensure that all processes that participate in the creation of 
copies of the relevant tables are seeing the same version of 
the relevant tables, only those snapshot workers that have 
been assigned snapshot times prior to the expiration of 1: 
T_WAIT (the "successful snapshot workers") are used in 
retrieving data from database 204. For the purpose of 
explanation, it shall be assumed that snapshot workers 
206-214 successfully obtained snapshot times, and that 
snapshot worker 215 did not. 2( 

Because the tables identified in the copy table list 218 
were locked at the time the successful snapshot workers 
were assigned their snapshot times, the successful snapshot 
workers are guaranteed that when they retrieve data from the 
desired table, they will retrieve data which corresponds to a 2i 
single database state. 

For example, FIG. 2B illustrates locking database 204 to 
obtain snapshot times that correspond to a single database 
state in accordance with an embodiment of the invention. In 
FIG. 2B, DB_A, DB_B and DB_C represent different 
states of the relevant tables within database 204. In this 
example, the desired tables in database 204 are locked by 
coordinator 216 at time Tl for a T__WAIT period of "4", 
and, therefore cannot be updated by another process until 35 
T5. The coordinator 216 maintains its locks on the desired 
tables in database 204 until T5. Thus, if snapshot workers 
206, 208 and 210 obtain snapshot times at Tl, T3 and T4, 
they will all retrieve data which corresponds to a single 
database state (DB_JB) from the desired tables. ^ 

In one embodiment, the successful snapshot workers are 
not required to wait until the end of the T_WATT period of 
time before they begin to retrieve data from database 204. 
For example, referring to FIG. 2B, if a snapshot worker 
obtains a snapshot time at T2, the snapshot worker may 45 
begin to retrieve data from the desired tables before the 
tables are unlocked at T5. 

Using the snapshot times, the successful snapshot workers 
(206, 208, 210, 212 and 214) retrieve data from database 204 
and copy it into a plurality of flat files (220, 22, 224, 226 and 50 
228). The coordinator 216 is responsible for coordinating the 
particular data (i.e. the tables identified in data copy table 
218) that is retrieved by each of the successful snapshot 
workers (206, 208, 210, 212 and 214). 

When any flat file of the plurality of flat files (220, 22, 55 
224, 226 and 228) has been completed, one of a plurality of 
loader workers (230, 232, 234, 236 and 238) load the flat 
files into one of a plurality of snapshot tables (1..N) in 
snapshot table database 240. The plurality of snapshot tables 
(1..N) in snapshot table database 240 are mapped on a $o 
one-to-one basis with the desired database tables (1..N) in 
database 204. In certain embodiments, the loader workers 
are SQL loaders that write data from the flat files directly 
into the snapshot tables (1..N) in the snapshot table database 
240. 65 

Before the data from a flat file thus created can be loaded 
into a particular snapshot table in snapshot table database 
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240, any data that was previously stored in the particular 
snapshot table (i.e. "stale data" from a prior execution of the 
planning software process 202) must first be removed. To 
perform this task, a plurality of delete workers (242, 244, 
246 and 248) are used to delete previously stored snapshot 
tables from snapshot table database 240. 

Once the previously stored snapshot tables are deleted and 
the flat files are loaded into the snapshot tables in the 
snapshot table database 240, snapshot table database 240 
1 contains a consistent snapshot of the desired tables identified 
in data copy table 218 as of a particular point in time. 

In a preferred embodiment, once the desired tables are 
copied into the flat files, the memory based planner 202 
reads the flat file information to generate a planning sched- 
ule. In an alternative embodiment, the memory based plan- 
ner 202 waits for the flat files to be copied into the snapshot 
tables in snapshot table database 240 and then reads the 
snapshot table information to generate a planning schedule. 

Sequence for Producing a Copy of a Database 

FIG. 3 A and 3B are flow diagrams illustrating a method 
for producing a copy of desired data from database 204 
according to an embodiment of the invention. For explana- 
tion purposes, the flow diagrams of FIGS. 3A and 3B are 
described with reference to the components of FIG. 2. 

At step 302, a memory based planner 202 is executed by 
a user to generate a planning schedule that is based upon 
data from database 204. At step 304, the memory based 
planner 202 causes a snapshot worker 206 to be launched. 

At step 306, the snapshot worker 206 determines which 
tables of database 204 are required by the memory based 
planner 202 for generating the planning schedule. Using this 
information, the snapshot worker 206 generates a copy table 
list 218. 

In one embodiment, each entry in the copy table list 218 
includes a table ID 250, a delete start time 252 and a delete 
complete time 254. The table ID 250 identifies a particular 
table in database 204 that is required by the memory based 
planner 202 for the generation of the planing schedule. The 
delete start time 252 identifies a system timestamp in which 
a delete worker began to delete a corresponding snapshot 
table (to remove "stale data") in snapshot table database 240, 
as identified by the particular table ED. For example, entry 
256 of copy table list 218 indicates that a delete worker 
began to delete snapshot table 15 in snapshot table database 
240 at the system time of 900. 

The delete complete time 254 identifies a system times- 
tamp in which a delete worker completed the deletion of a 
snapshot table in snapshot table database 240, as identified 
by the particular table ID. For example, entry 256 of copy 
table list 218 indicates that a delete worker completed the 
deletion of snapshot table 15 in snapshot table database 240 
at the system time of 950. The delete start time 252 and 
delete complete time 254 entries are used by the coordinator 
216 to determine whether data can be loaded into a particular 
snapshot table (L.N). 

At step 308, the snapshot worker 206 spawns a coordi- 
nator process 216 for coordinating the copying of data from 
database 204. At step 309, the snapshot worker 206 notifies 
the coordinator 216 that the copy table list 218 has been 
created. At step 310, the coordinator 216 obtains locks on the 
tables identified in the copy table list 218. 

At step 312, after obtaining a lock on each table identified 
in copy table list 218, the coordinator 216 requests a 
plurality of snapshot workers (206-214) to solicit snapshot 
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times from the DBMS. In one embodiment, upon executing flat files, informing the coordinator 216 each time one of the 

a memory based planner, the user may specify a particular flat files has been completed. 

number of snapshot workers for the coordinator 216 to At step 324, the coordinator 216 determines whether the 

request to solicit snapshot times. delete workers (242-248) have deleted the snapshot tables 

At step 314, the coordinator waits a certain period of time 5 (i.e. "stale data") in the snapshot table database 240 that 

(T_WAIT) for the requested snapshot workers (206-214) to correspond to the data that the snapshot worker has just 

be assigned snapshot times. For example, in one completed copying. For example, if the snapshot worker 

embodiment, the coordinator 216 waits several minutes after informs the coordinator 216 that database tables "1", "5", 

requesting the plurality of snapshot workers (206-214) to and "8" in database 204 have just been copied to the flat file' 
solicit snapshot times from the DBMS. 10 222, the coordinator 216 determines whether the delete 

At step 316, after the (T__WAn) period of time expires, workers (242-248) have removed the "stale data" by delet- 

the coordinator releases the locks on the tables identified in ing snapshot tables "1", "5" and "8" in snapshot table 

copy table list 218. database 240. 

At step 318, the coordinator 216 uses the copy table list In certain embodiments, the coordinator 216 uses the 

218 to assign the successful snapshot workers (in this 15 information in the copy table list 218 to determine if a 

example, snapshot workers 206-214), aparticular set of data particular snapshot table has been deleted. In one 

to copy from database 204. For example, snapshot worker embodiment, the coordinator 216 uses the delete complete 

206 may be assigned to copy database table "17" from time 254 parameter to determine if a snapshot table has been 

database 204. Once assigned a set of data, the successful deleted. 

snapshot workers (206-214) begin to copy the assigned data 20 If at step 324 the coordinator 216 determines that a 

from database 204 into flat files (220-228). snapshot table has not been deleted (i.e. snapshot tables "1", 

In certain embodiments, the coordinator 216 assigns the "5" and "8" for the previous example), then at step 326, the 

snapshot workers a particular set of data to copy before the coordinator 216 delays the launching (e.g. spawning) of a 

(T_WAIT) period of time expires. For example, the coor- 2$ loader worker (230-238) for loading the information until 

dinator could assign a particular copying task at the same me snapshot tables "1", "5" and "8" have been deleted. For 

time that the coordinator requests the snapshot workers to example, assuming that snapshot worker 206 has notified 

obtain snapshot times. Without waiting to the T__WATT coordinator 216 that it has copied database tables "1", "5" 

period to expire, each snapshot worker can begin the copy- and "8" from database 204 into flat file 220, if coordinator 

ing task as soon as (1) it is assigned the set of data copy, and 3Q 216 determines that snapshot table "5" in snapshot table 

(2) it has successfully obtained a snapshot rime. database 240 has not been deleted (i.e. the "stale data" has 

At step 320, the coordinator launches delete workers not Deen removed), then coordinator 216 will delay launch- 
(242-248) to delete "stale data" contained in the snapshot loader worker 230 for loading the information into 
tables in snapshot table database 240. In deleting the "stale snapshot tables "1", "5" and "8", until snapshot table "5" has 
data", the delete workers use the copy table list 218 to 35 Deen deleted. Control then proceeds to step 338. 
identify which snapshot tables in snapshot table database If at step 324, the coordinator 216 determines that the 
240 need to be deleted. In one embodiment, the delete snapshot table has been deleted, then at step 328 the coor- 
workers (242-248) respectively update the delete start time dinator 216 launches a loader worker (230-238) to load the 
252 and delete complete time 254 parameters for the cor- information from the flat file into the corresponding snap- 
responding entry as they begin and complete the deletion of ^ shot table in snapshot table database 240. 
a particular table identified in copy table list 218. At step 330, when a loader worker (230-238) finishes 
For example, the delete worker 242 may use the infor- loading a flat file (22G-228) into its corresponding snapshot 
mation at entry 258 of copy table list 218 to identify a file in snapshot table database 240, it notifies the coordinator 
snapshot table "7" in snapshot table database 240 as a table 216. In certain embodiments, the loader workers (230-238) 
that needs to be deleted. The delete worker 242 then stores 45 notify the coordinator 216 of the particular snapshot table in 
the current system time ("1000" in this example) in the which the data was loaded. 

delete start time 252 parameter of entry 258 and begins At step 332, the coordinator 216 determines whether all 

deleting the snapshot table "7". When delete worker 242 the desired tables identified in copy table list 218 have been 

completes the deletion of snapshot table "7", it stores the copied from database 204 into the flat files (220-228). 

current system time ("1130" in this example) in the delete 50 If at step 332 the coordinator 216 determines that all the 

complete time 254 parameter of entry 258. tables identified in copy table list 218 have been copied, then 

It should be noted that although the delete workers at step 334, the coordinator 216 notifies the memory based 

(242-248) are launched at step 320 in this example, the planner 202 that the flat files (220-228) contain a valid copy 

coordinator 216 may actually launch them prior to or even of the desired tables from database 204. At step 336, the 

after step 320. For example, in one embodiment, the coor- 55 memory based planner 202 uses the information in the flat 

dinator 216 launches the delete workers (242-248) imme- files (220-228) to generate the planning schedule, 

diately after it is notified that the copy table list 218 has been In an alternative embodiment, at step 332, the coordinator 

created in step 308. 216 determines whether all the desired tables identified in 

At step 322, a snapshot worker notifies the coordinator copy table list 218 have been copied from database 204 into 

216 it has finished copying the data it was assigned. In one 60 the snapshot tables in snapshot table database 240. If the 

embodiment, the snapshot worker informs the coordinator coordinator 216 determines that all the tables identified in 

216 of the data that was copied (i.e. the particular database copy table list 218 have been copied, then at step 334, the 

tables in database 204) and the location of the one or more coordinator 216 notifies the memory based planner 202 that 

flat files that contains the copy of the data. Although FIG. 2 the snapshot table database 240 contains a valid copy of the 

depicts each snapshot worker using a single flat file for 65 desired tables from database 204. At step 336, the memory 

copying its assigned data, in certain embodiments, the based planner 202 then uses the snapshot tables in snapshot 

snapshot workers may copy their assigned data to multiple table database 240 to generate the planning schedule. 
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If at step 332 the coordinator 216 determines that all the The delete complete time 254 identifies a system time 

tables identified in copy table list 218 have not been copied, stamp in which a delete worker completed the deletion of a 

then at step 338, the coordinator 216 continues to accept and snapshot table in snapshot table database 240, as identified 

process other completion notifications from snapshot work- by the particular table ID. For example, entry 256 of copy 
ers (206-214). When the coordinator 216 receives a notifi- 5 table list 218 indicates that a delete worker completed the 

cation that a snapshot worker (206-214) has finished copy- deletion of snapshot table 15 in snapshot table database 240 

ing its assigned data to a particular flat file, control proceeds at the system time of 950. The delete start time 252 and 

to step 324 to determine whether the delete workers delete complete time 254 entries are used by the coordinator 

(242-248) have deleted the corresponding snapshot tables in 216 to determine whether data can be loaded into a particular 
the snapshot table database 240. 10 snapshot table (L.N). 

Producing a Copy without Locking Database M stc P 508 ' the snapshot worker 206 spawns a coordi- 

Tables nator process 216 for coordinating the copying of data from 

database 204. At step 509, the snapshot worker 206 notifies 

In the embodiment described above, the desired tables are ^ c coordinator 216 that the copy table list 218 has been 
locked for a relatively brief period T_WAIT while snapshot 15 created 

workers obtain snapshot times^ In an alternative M st 510( ^ coa&im[ 216 ^ ^ hot 

embodiment a mechamsm <s provided for producing a copy workefs 20fi , Q ^ a ^ times \ om the DB * s 

of desired tables from a database without locking the desired A( 4 C11 tL / . e . 

f , , r- t • t j • , f ,? . . At step 512, the coordinator waits for the snapshot worker 

tables. For example, in one embodiment, the following steps , / . , ... F 

r j * L • i_ * r j 1- **t_ . 206 to be assigned a snapshot time, 

are performed to obtain a snapshot of a database without 20 A M . , ~\ 

locking the desired tables- M ste P 514 ' ±c coordinator 216 uses the copy table list 

(1) the desired tables are identified; J" * "J J* Sn ^ 1 W0 * er ™> a P art j cular « rf 
; ' . . , , \ . . data to copy from database 204. For example, snapshot 

(2) a single snapshot worker process is requested to obtain WOfker 2 06 may be assigned to copy database table "17" 
a snapshot time; from database 2 04. Once assigned a set of data, the snapshot 

(3) using the single snapshot worker, a copy of desired 2 worker 206 begins to copy the assigned data from database 
tables is produced which reflects the state of the database 204 into fiat files (220-226). 

associated with the snapshot time. At stcp 516> the coordinator i auac hes delete workers 

FIG. 4 depicts a block diagram of a system 400 that is (242-248) to delete "stale data" contained in the snapshot 
used for producing a copy of a database in accordance with 3Q tables in snapshot table database 240. In deleting the "stale 

an embodiment of the present invention. FIG. 4 is similar to data", the delete workers use the copy table list 218 to 

FIG. 2, and therefore like components have been numbered identify which snapshot tables in snapshot table database 

au ^ e - 240 need to be deleted. In one embodiment, the delete 

The system 400 differs from that shown in FIG. 2 in that workers (242-248) respectively update the delete start time 
system 400 includes a single snapshot worker 206 which is 35 252 and delete complete time 254 parameters for the cor- 

used to retrieve data from database 204. By using a single responding entry as they begin and complete the deletion of 

snapshot worker 206, only a single snapshot time is a particular table identified in copy table list 218. 

required. Because only a single snapshot time is required, For example, the delete worker 242 may use the infor- 

the coordinator 216 is not required to lock the database mation at entry 258 of copy table list 218 to identify a 
tables in database 204. ^ snapshot table "7" in snapshot table database 240 as a table 

FIG. 5 A and 5B are flow diagrams illustrating a method that needs to be deleted. The delete worker 242 then stores 

for producing a copy of desired data from database 204, the current system time ("1000" in this example) in the 

without locking the database tables that contain the desired delete start time 252 parameter of entry 258 and begins 

data, according to an embodiment of the invention. For deleting the snapshot table "7". When delete worker 242 

explanation purposes, the flow diagrams of FIGS. 5A and 5B 45 completes the deletion of snapshot table "7", it stores the 

are described with reference to the components of FIG. 4. current system time ("1130" in this example) in the delete 

At step 502, a memory based planner 202 is executed by complete time 254 parameter of entry 258. 

a user to generate a planning schedule that is based upon It should be noted that although the delete workers 

data from database 204. At step 504, the memory based (242-248) are launched at step 516 in this example, the 

planner 202 causes a snapshot worker 206 to be launched. 50 coordinator 216 may actually launch them prior to or even 

At step 506, the snapshot worker 206 determines which after step 516. For example, in one embodiment, the coor- 

tables of database 204 are required by the memory based dinator 216 launches the delete workers (242-248) imme- 

planner 202 for generating the planning schedule. Using this diately after it is notified that the copy table list 218 has been 

information, the snapshot worker 206 generates a copy table created in step 508. 

list 218. 55 At step 518, the snapshot worker 206 notifies the coor- 

In one embodiment, each entry in the copy table list 218 dinator 216 it has finished copying the data it was assigned, 

includes a table ID 250, a delete start time 252 and a delete In one embodiment, the snapshot worker informs the coor- 

complete time 254. The table ID 250 identifies a particular dinator 216 of the data that was copied (i.e. the particular 

table in database 204 that is required by the memory based database tables in database 204) and the location of the one 

planner 202 for the generation of the planing schedule. The 60 or more flat files that contains the copied data. As illustrated 

delete start time 252 identifies a system time stamp in which by flat files (220-226), snapshot worker 206 may copy its 

a delete worker began to delete a corresponding snapshot assigned data to multiple flat files, informing the coordinator 

table (to remove "stale data") in snapshot table database 240, 216 each time one of the flat files has been completed, 

as identified by the particular table ID. For example, entry At step 520, the coordinator 216 determines whether the 

256 of copy table list 218 indicates that a delete worker 65 delete workers (242-248) have deleted the snapshot tables 

began to delete snapshot table 15 in snapshot table database (i.e. removed the "stale data") in the snapshot table database 

240 at the system time of 900. 240 that correspond to the data that the snapshot worker has 
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just completed copying. For example, if the snapshot worker 
206 informs the coordinator 216 that database tables "1", 
"5", and "8" in database 204 have just been respectively 
copied to the fiat files 222, 224 and 226, the coordinator 216 
determines whether the delete workers (242-248) have 
deleted snapshot tables "1", "5" and "8" in snapshot table 
database 240, thereby removing the "stale data". 

In certain embodiments, the coordinator 216 uses the 
information in the copy table list 218 to determine if a 
particular snapshot table has been deleted. In one 
embodiment, the coordinator 216 uses the delete complete 
time 254 parameter to determine if a snapshot table has been 
deleted. 

If at step 520 the coordinator 216 determines that a 
snapshot table has not been deleted (i.e. snapshot tables "1", 
"5" and "8" for the previous example), then at step 522, the 
coordinator 216 delays the launching (e.g. spawning) of a 
loader worker (230-238) for loading the information until 
the snapshot tables "1", "5" and "8" have been deleted. For 
example, assuming that snapshot worker 206 has notified 
coordinator 216 that it has copied database tables "1", "5" 
and "8" from database 204 into flat file 220, if coordinator 
216 determines that snapshot table "5 M in snapshot table 
database 240 has not been deleted (i.e. the "stale data" has 
not been removed), then coordinator 216 will delay launch- 
ing loader worker 230 for loading the information into 
snapshot tables "1", "5" and "8", until snapshot table "5" has 
been deleted. Control then proceeds to step 534. 

If at step 520, the coordinator 216 determines that the 
snapshoUable has been deleted, then at step 524 the coor- 
dinator 216 launches a loader worker (230-238) to load the 
information from the flat file into the corresponding snap- 
shot table in snapshot table database 240. 

At step 526, when a loader worker (230-238) finishes 
loading a flat file (220-228) into its corresponding snapshot 
file in snapshot table database 240, it notifies the coordinator 
216. In certain embodiments, the loader workers (230-238) 
notify the coordinator 216 of the particular snapshot table in 
which the data was loaded. 

At step 528, the coordinator 216 determines whether all 
the desired tables identified in copy table list 218 have been 
copied from database 204 into the snapshot tables in snap- 
shot table database 240. 

If at step 528 the coordinator 216 determines that all the 
tables identified in copy table list 218 have been copied, then 
at step 530, the coordinator 216 notifies the memory based 
planner 202 that the snapshot table database 240 contains a 
valid copy of the desired tables from database 204. At step 
532, the memory based planner 202 uses the snapshot tables 
in snapshot table database 240 to generate the planning 
schedule. 

Conversely, if at step 528 the coordinator 216 determines 
that all the tables identified in copy table list 218 have not 
been copied, then at step 534, the coordinator 216 continues 
to accept and process other completion notifications from 
snapshot workers (206-214). When the coordinator 216 
receives a notification that a snapshot worker (206-214) has 
finished copying its assigned data to a particular flat file, 
control proceeds to step 520 to determine whether the delete 
workers (242-248) have deleted the corresponding snapshot 
tables in the snapshot table database 240. 

Copying Data Directly Into Another Database 

Certain database systems provide snapshot mechanisms 65 
for transactions that modify the database, as well as for "read 
only" transactions. Such a mechanism is described in U.S. 
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patent application Ser. No. 08/613,026, filed Mar. 11, 1996, 
Entitled "Method and Apparatus for Providing Isolation 
Levels in a Data", the contents of which are incorporated 
herein by this reference. In certain embodiments, such a 
snapshot mechanism is used to allow the snapshot workers 
to copy the desired tables into a database without the use of 
flat files. 

FIG. 6 depicts a block diagram of a system 600 that is 
used for producing a copy of a database in accordance with 
an embodiment of the present invention. FIG. 6 is similar to 
FIG. 2, and therefore like components have been numbered 
alike. 

As illustrated in FIG. 6, snapshot workers (206-214) 
retrieve database table information from database 204 and 
copy it directly into the snapshot table database 240. By 
writing directly to the snapshot table database 240, the 
intermediate steps of copying data to flat files and launching 
loader workers to copy that data from the flat files into the 
snapshot table database 240 can be eliminated. In one 
embodiment, the snapshot workers (206-214) copy the 
database tables (1..N) in database 204 directly into tempo- 
rary files in snapshot table database 240. These temporary 
files are used by the memory based planner 202 for gener- 
ating a planning schedule. 

In an alternative embodiment, the snapshot workers 
(206-214) copy the desired database tables (1..N) in data- 
base 204 directly into snapshot table database 240 as "large 
binary objects" (BLOBs). In general, a BLOB is a data item 
that consists of a large amount of data and may be used by 
systems that require data types that are typically much larger 
than traditional data types. For example, a single BLOB may 
include four gigabytes of data and may be thought of as a file 
or a stream of characters or bytes. 

In certain embodiments, snapshot workers (206-214) 
retrieve database table information from database 204 and 
copy it as separate data back into database 204. By copying 
the database table information from database 204 as separate 
data in database 204, the intermediate steps of copying the 
data to flat files can be eliminated. In one embodiment, the 
desired database tables (1..N) in database 204 are direcdy 
copied as separate BLOBs in database 204. 

In the foregoing specification, the invention has been 
described with reference to specific embodiments thereof. It 
will, however, be evident that various modifications and 
changes may be made thereto without departing from the 
broader spirit and scope of the invention. The specification 
and drawings are, accordingly, to be regarded in an illus- 
trative rather than a restrictive sense. 
What is claimed is: 

1. A method for supplying a consistent set of data to a 
software application, the method comprising the steps of: 

launching said software application; 

identifying a particular set of data that is required by the 
software application; 

requesting a first process to obtain a snapshot time from 
a database server associated with a first database, 
wherein the snapshot time causes all subsequent reads 
of said first database by the first process to return data 
that reflects a database state associated with the snap- 
shot time; 

after the first process obtains the snapshot time, causing 
the first process to extract the particular set of data from 
the first database; and 

supplying said software application with the particular set 
of data that was extracted from the first database. 



